Overview
Dataset statistics
| Number of variables | 15 |
|---|---|
| Number of observations | 500000 |
| Missing cells | 125000 |
| Missing cells (%) | 1.7% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 57.2 MiB |
| Average record size in memory | 120.0 B |
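The derived figures in this table can be checked against the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (observations × variables) and that the average record size is the total memory divided by the number of observations. The variable names are illustrative and do not come from the report.

```python
# Values copied from the Dataset statistics table.
n_variables = 15
n_observations = 500_000
missing_cells = 125_000
total_memory_mib = 57.2

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record; 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_memory_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures reported above.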
Variable types
| Numeric | 8 |
|---|---|
| Categorical | 6 |
| DateTime | 1 |
Alerts
| age has 25000 (5.0%) missing values | Missing |
|---|---|
| gender has 25000 (5.0%) missing values | Missing |
| employment_type has 25000 (5.0%) missing values | Missing |
| annual_income has 25000 (5.0%) missing values | Missing |
| credit_score has 25000 (5.0%) missing values | Missing |
| customer_id is uniformly distributed | Uniform |
| customer_id has unique values | Unique |
| repayment_history has 67883 (13.6%) zeros | Zeros |
Reproduction
| Analysis started | 2026-02-23 04:17:27.089989 |
|---|---|
| Analysis finished | 2026-02-23 04:17:53.897472 |
| Duration | 26.81 seconds |
| Software version | ydata-profiling v4.18.1 |
| Download configuration | config.json |
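The reported duration is the difference between the two timestamps. A minimal sketch using only the standard library, with the timestamps copied from the table:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2026-02-23 04:17:27.089989")
finished = datetime.fromisoformat("2026-02-23 04:17:53.897472")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"Duration: {duration_seconds:.2f} seconds")
```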
Variables
customer_id
Real number (ℝ)
Uniform Unique
| Distinct | 500000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 349999.5 |
| Minimum | 100000 |
|---|---|
| Maximum | 599999 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.8 MiB |
Quantile statistics
| Minimum | 100000 |
|---|---|
| 5-th percentile | 124999.95 |
| Q1 | 224999.75 |
| median | 349999.5 |
| Q3 | 474999.25 |
| 95-th percentile | 574999.05 |
| Maximum | 599999 |
| Range | 499999 |
| Interquartile range (IQR) | 249999.5 |
Descriptive statistics
| Standard deviation | 144337.71 |
|---|---|
| Coefficient of variation (CV) | 0.41239405 |
| Kurtosis | -1.2 |
| Mean | 349999.5 |
| Median Absolute Deviation (MAD) | 125000 |
| Skewness | 8.7177233 × 10⁻¹⁷ |
| Sum | 1.7499975 × 10¹¹ |
| Variance | 2.0833375 × 10¹⁰ |
| Monotonicity | Strictly increasing |
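These statistics are what one would expect from a run of consecutive integers. The sketch below assumes that customer_id takes every integer from the minimum to the maximum exactly once, which the Distinct count and the strictly increasing monotonicity suggest. It also assumes the report uses the sample variance (n − 1 denominator), for which n consecutive integers give n(n + 1)/12.

```python
import math
import statistics

lo, hi = 100_000, 599_999      # Minimum and Maximum from the table
n = hi - lo + 1                # number of consecutive integers

mean = (lo + hi) / 2
variance = n * (n + 1) / 12    # sample variance of n consecutive integers
std = math.sqrt(variance)
cv = std / mean                # coefficient of variation

# Sanity check of the closed form on a small range.
small = list(range(10, 20))
assert math.isclose(statistics.variance(small), len(small) * (len(small) + 1) / 12)

print(n, mean, variance, round(std, 2), round(cv, 8))
```

The kurtosis of -1.2 is also in line with a uniform distribution, whose excess kurtosis is -6/5.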
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 100000 | 1 | < 0.1% |
| 100001 | 1 | < 0.1% |
| 100002 | 1 | < 0.1% |
| 100003 | 1 | < 0.1% |
| 100004 | 1 | < 0.1% |
| 100005 | 1 | < 0.1% |
| 100006 | 1 | < 0.1% |
| 100007 | 1 | < 0.1% |
| 100008 | 1 | < 0.1% |
| 100009 | 1 | < 0.1% |
| Other values (499990) | 499990 | > 99.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 100000 | 1 | < 0.1% |
| 100001 | 1 | < 0.1% |
| 100002 | 1 | < 0.1% |
| 100003 | 1 | < 0.1% |
| 100004 | 1 | < 0.1% |
| 100005 | 1 | < 0.1% |
| 100006 | 1 | < 0.1% |
| 100007 | 1 | < 0.1% |
| 100008 | 1 | < 0.1% |
| 100009 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 599999 | 1 | < 0.1% |
| 599998 | 1 | < 0.1% |
| 599997 | 1 | < 0.1% |
| 599996 | 1 | < 0.1% |
| 599995 | 1 | < 0.1% |
| 599994 | 1 | < 0.1% |
| 599993 | 1 | < 0.1% |
| 599992 | 1 | < 0.1% |
| 599991 | 1 | < 0.1% |
| 599990 | 1 | < 0.1% |
age
Real number (ℝ)
Missing
| Distinct | 49 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 25000 |
| Missing (%) | 5.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45.011236 |
| Minimum | 21 |
|---|---|
| Maximum | 69 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.8 MiB |
Quantile statistics
| Minimum | 21 |
|---|---|
| 5-th percentile | 23 |
| Q1 | 33 |
| median | 45 |
| Q3 | 57 |
| 95-th percentile | 67 |
| Maximum | 69 |
| Range | 48 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 14.134525 |
|---|---|
| Coefficient of variation (CV) | 0.31402215 |
| Kurtosis | -1.1992601 |
| Mean | 45.011236 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | -0.0016654567 |
| Sum | 21380337 |
| Variance | 199.78479 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 43 | 9885 | 2.0% |
| 65 | 9883 | 2.0% |
| 37 | 9864 | 2.0% |
| 56 | 9860 | 2.0% |
| 35 | 9838 | 2.0% |
| 36 | 9820 | 2.0% |
| 42 | 9818 | 2.0% |
| 25 | 9797 | 2.0% |
| 57 | 9793 | 2.0% |
| 58 | 9790 | 2.0% |
| Other values (39) | 376652 | 75.3% |
| (Missing) | 25000 | 5.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 21 | 9722 | 1.9% |
| 22 | 9789 | 2.0% |
| 23 | 9679 | 1.9% |
| 24 | 9561 | 1.9% |
| 25 | 9797 | 2.0% |
| 26 | 9513 | 1.9% |
| 27 | 9610 | 1.9% |
| 28 | 9649 | 1.9% |
| 29 | 9570 | 1.9% |
| 30 | 9674 | 1.9% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 69 | 9578 | 1.9% |
| 68 | 9778 | 2.0% |
| 67 | 9761 | 2.0% |
| 66 | 9517 | 1.9% |
| 65 | 9883 | 2.0% |
| 64 | 9706 | 1.9% |
| 63 | 9453 | 1.9% |
| 62 | 9619 | 1.9% |
| 61 | 9756 | 2.0% |
| 60 | 9774 | 2.0% |
gender
Categorical
Missing
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 25000 |
| Missing (%) | 5.0% |
| Memory size | 3.8 MiB |
| Female | 228155 |
|---|---|
| Male | 227893 |
| Other | 18952 |
Length
| Max length | 6 |
|---|---|
| Median length | 5 |
| Mean length | 5.0005516 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Female |
|---|---|
| 2nd row | Female |
| 3rd row | Female |
| 4th row | Female |
| 5th row | Female |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Female | 228155 | 45.6% |
| Male | 227893 | 45.6% |
| Other | 18952 | 3.8% |
| (Missing) | 25000 | 5.0% |
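The frequencies and the mean length reported for this variable can be reproduced from the category counts. The sketch assumes that frequencies are taken over all 500000 rows (missing included) and that the mean length is taken over non-missing values only. The dictionary and variable names are illustrative.

```python
n_observations = 500_000
counts = {"Female": 228_155, "Male": 227_893, "Other": 18_952}

# Rows not covered by any category are the missing values.
n_missing = n_observations - sum(counts.values())

# Frequency relative to all rows, missing included.
frequency_pct = {v: round(100 * c / n_observations, 1) for v, c in counts.items()}
frequency_pct["(Missing)"] = round(100 * n_missing / n_observations, 1)

# Mean string length over the non-missing values only.
total_chars = sum(len(value) * count for value, count in counts.items())
mean_length = total_chars / sum(counts.values())

print(frequency_pct)
print(total_chars, round(mean_length, 7))
```

The total character count computed here is the same figure that appears in the character breakdown tables for this variable.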
Plots (not shown): Length, Common Values
Words
| Value | Count | Frequency (%) |
|---|---|---|
| female | 228155 | 48.0% |
| male | 227893 | 48.0% |
| other | 18952 | 4.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| e | 703155 | 29.6% |
| a | 456048 | 19.2% |
| l | 456048 | 19.2% |
| F | 228155 | 9.6% |
| m | 228155 | 9.6% |
| M | 227893 | 9.6% |
| O | 18952 | 0.8% |
| t | 18952 | 0.8% |
| h | 18952 | 0.8% |
| r | 18952 | 0.8% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2375262 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2375262 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2375262 | 100.0% |
region
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.8 MiB |
| South | 125341 |
|---|---|
| East | 125283 |
| West | 125058 |
| North | 124318 |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4.499318 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | East |
|---|---|
| 2nd row | South |
| 3rd row | North |
| 4th row | North |
| 5th row | East |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| South | 125341 | 25.1% |
| East | 125283 | 25.1% |
| West | 125058 | 25.0% |
| North | 124318 | 24.9% |
Plots (not shown): Length, Common Values
Words
| Value | Count | Frequency (%) |
|---|---|---|
| south | 125341 | 25.1% |
| east | 125283 | 25.1% |
| west | 125058 | 25.0% |
| north | 124318 | 24.9% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| t | 500000 | 22.2% |
| s | 250341 | 11.1% |
| h | 249659 | 11.1% |
| o | 249659 | 11.1% |
| u | 125341 | 5.6% |
| S | 125341 | 5.6% |
| E | 125283 | 5.6% |
| a | 125283 | 5.6% |
| W | 125058 | 5.6% |
| e | 125058 | 5.6% |
| Other values (2) | 248636 | 11.1% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2249659 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2249659 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2249659 | 100.0% |
education_level
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.8 MiB |
| Graduate | 175082 |
|---|---|
| Secondary | 174326 |
| Post-Graduate | 75462 |
| Primary | 75130 |
Length
| Max length | 13 |
|---|---|
| Median length | 9 |
| Mean length | 8.953012 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Secondary |
|---|---|
| 2nd row | Graduate |
| 3rd row | Secondary |
| 4th row | Secondary |
| 5th row | Graduate |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Graduate | 175082 | 35.0% |
| Secondary | 174326 | 34.9% |
| Post-Graduate | 75462 | 15.1% |
| Primary | 75130 | 15.0% |
Plots (not shown): Length, Common Values
Words
| Value | Count | Frequency (%) |
|---|---|---|
| graduate | 175082 | 35.0% |
| secondary | 174326 | 34.9% |
| post-graduate | 75462 | 15.1% |
| primary | 75130 | 15.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| a | 750544 | 16.8% |
| r | 575130 | 12.8% |
| d | 424870 | 9.5% |
| e | 424870 | 9.5% |
| t | 326006 | 7.3% |
| G | 250544 | 5.6% |
| u | 250544 | 5.6% |
| o | 249788 | 5.6% |
| y | 249456 | 5.6% |
| c | 174326 | 3.9% |
| Other values (7) | 800428 | 17.9% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 4476506 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 4476506 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 4476506 | 100.0% |
employment_type
Categorical
Missing
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 25000 |
| Missing (%) | 5.0% |
| Memory size | 3.8 MiB |
| Salaried | 285466 |
|---|---|
| Self-Employed | 118933 |
| Unemployed | 70601 |
Length
| Max length | 13 |
|---|---|
| Median length | 8 |
| Mean length | 9.5491937 |
| Min length | 8 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Salaried |
|---|---|
| 2nd row | Self-Employed |
| 3rd row | Salaried |
| 4th row | Salaried |
| 5th row | Self-Employed |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Salaried | 285466 | 57.1% |
| Self-Employed | 118933 | 23.8% |
| Unemployed | 70601 | 14.1% |
| (Missing) | 25000 | 5.0% |
Plots (not shown): Length, Common Values
Words
| Value | Count | Frequency (%) |
|---|---|---|
| salaried | 285466 | 60.1% |
| self-employed | 118933 | 25.0% |
| unemployed | 70601 | 14.9% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| e | 664534 | 14.7% |
| l | 593933 | 13.1% |
| a | 570932 | 12.6% |
| d | 475000 | 10.5% |
| S | 404399 | 8.9% |
| r | 285466 | 6.3% |
| i | 285466 | 6.3% |
| p | 189534 | 4.2% |
| y | 189534 | 4.2% |
| m | 189534 | 4.2% |
| Other values (6) | 687535 | 15.2% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 4535867 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 4535867 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 4535867 | 100.0% |
annual_income
Real number (ℝ)
Missing
| Distinct | 473709 |
|---|---|
| Distinct (%) | 99.7% |
| Missing | 25000 |
| Missing (%) | 5.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 549733.39 |
| Minimum | 15700.25 |
|---|---|
| Maximum | 16228345 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.8 MiB |
Quantile statistics
| Minimum | 15700.25 |
|---|---|
| 5-th percentile | 165058.11 |
| Q1 | 296246.75 |
| median | 445371.72 |
| Q3 | 670686.66 |
| 95-th percentile | 1244206.6 |
| Maximum | 16228345 |
| Range | 16212645 |
| Interquartile range (IQR) | 374439.91 |
Descriptive statistics
| Standard deviation | 439483.33 |
|---|---|
| Coefficient of variation (CV) | 0.79944812 |
| Kurtosis | 75.619804 |
| Mean | 549733.39 |
| Median Absolute Deviation (MAD) | 173902.5 |
| Skewness | 5.5248683 |
| Sum | 2.6112336 × 10¹¹ |
| Variance | 1.931456 × 10¹¹ |
| Monotonicity | Not monotonic |
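The high skewness and the gap between the 95th percentile and the maximum point to a long right tail. One common way to put a number on "unusually large" is Tukey's 1.5 × IQR rule. The report itself does not apply this rule; the sketch below is only an illustration using the quartiles reported above, and the fence names are hypothetical.

```python
# Quartiles and maximum copied from the annual_income tables.
q1 = 296_246.75
q3 = 670_686.66
maximum = 16_228_345.4

# Interquartile range and Tukey fences (1.5 x IQR beyond each quartile).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR: {iqr:.2f}")
print(f"Fences: [{lower_fence:.2f}, {upper_fence:.2f}]")
print(f"Maximum exceeds upper fence: {maximum > upper_fence}")
```

Because incomes cannot be negative here, only the upper fence is informative for this variable.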
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 555412.42 | 3 | < 0.1% |
| 431574.26 | 3 | < 0.1% |
| 640402.61 | 3 | < 0.1% |
| 597865.84 | 3 | < 0.1% |
| 351330.78 | 3 | < 0.1% |
| 533790.49 | 2 | < 0.1% |
| 441238.82 | 2 | < 0.1% |
| 223062.09 | 2 | < 0.1% |
| 207109.28 | 2 | < 0.1% |
| 246767.02 | 2 | < 0.1% |
| Other values (473699) | 474975 | 95.0% |
| (Missing) | 25000 | 5.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 15700.25 | 1 | < 0.1% |
| 28921.66 | 1 | < 0.1% |
| 29899.25 | 1 | < 0.1% |
| 30214.11 | 1 | < 0.1% |
| 31358.69 | 1 | < 0.1% |
| 33013.98 | 1 | < 0.1% |
| 33905.19 | 1 | < 0.1% |
| 34703.65 | 1 | < 0.1% |
| 35008.55 | 1 | < 0.1% |
| 35364.69 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 16228345.4 | 1 | < 0.1% |
| 14791672.27 | 1 | < 0.1% |
| 14643400.72 | 1 | < 0.1% |
| 14329300.3 | 1 | < 0.1% |
| 13790718.38 | 1 | < 0.1% |
| 13676246.71 | 1 | < 0.1% |
| 13644771.71 | 1 | < 0.1% |
| 13576957.99 | 1 | < 0.1% |
| 13188909.65 | 1 | < 0.1% |
| 13001217.01 | 1 | < 0.1% |
loan_amount
Real number (ℝ)
| Distinct | 496793 |
|---|---|
| Distinct (%) | 99.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 224278.75 |
| Minimum | 3331.74 |
|---|---|
| Maximum | 7449221.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.8 MiB |
Quantile statistics
| Minimum | 3331.74 |
|---|---|
| 5-th percentile | 43787.844 |
| Q1 | 95226.975 |
| median | 162964.3 |
| Q3 | 279429.81 |
| 95-th percentile | 606722.88 |
| Maximum | 7449221.5 |
| Range | 7445889.8 |
| Interquartile range (IQR) | 184202.83 |
Descriptive statistics
| Standard deviation | 212115.15 |
|---|---|
| Coefficient of variation (CV) | 0.9457657 |
| Kurtosis | 30.472487 |
| Mean | 224278.75 |
| Median Absolute Deviation (MAD) | 81174.335 |
| Skewness | 3.6908967 |
| Sum | 1.1213938 × 10¹¹ |
| Variance | 4.4992838 × 10¹⁰ |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 110454.33 | 3 | < 0.1% |
| 89797.51 | 3 | < 0.1% |
| 310885.66 | 3 | < 0.1% |
| 375241.4 | 3 | < 0.1% |
| 75520.62 | 3 | < 0.1% |
| 133367.09 | 3 | < 0.1% |
| 119189.81 | 3 | < 0.1% |
| 66912.4 | 3 | < 0.1% |
| 139348.45 | 3 | < 0.1% |
| 213977.12 | 3 | < 0.1% |
| Other values (496783) | 499970 | > 99.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 3331.74 | 1 | < 0.1% |
| 3470.64 | 1 | < 0.1% |
| 3749.43 | 1 | < 0.1% |
| 4461.16 | 1 | < 0.1% |
| 4909.59 | 1 | < 0.1% |
| 5214.44 | 1 | < 0.1% |
| 5468.18 | 1 | < 0.1% |
| 5746.19 | 1 | < 0.1% |
| 5916.13 | 1 | < 0.1% |
| 5990.34 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 7449221.52 | 1 | < 0.1% |
| 7121808.62 | 1 | < 0.1% |
| 4982447.31 | 1 | < 0.1% |
| 4647066.59 | 1 | < 0.1% |
| 4641858.47 | 1 | < 0.1% |
| 4486562.43 | 1 | < 0.1% |
| 4457180.62 | 1 | < 0.1% |
| 4303462.2 | 1 | < 0.1% |
| 4254652.57 | 1 | < 0.1% |
| 4218754.74 | 1 | < 0.1% |
loan_purpose
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.8 MiB |
| Education | 100418 |
|---|---|
| Business | 100220 |
| Other | 100204 |
| Car | 100156 |
| Home | 99002 |
Length
| Max length | 9 |
|---|---|
| Median length | 5 |
| Mean length | 5.806036 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Education |
|---|---|
| 2nd row | Car |
| 3rd row | Home |
| 4th row | Car |
| 5th row | Car |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Education | 100418 | 20.1% |
| Business | 100220 | 20.0% |
| Other | 100204 | 20.0% |
| Car | 100156 | 20.0% |
| Home | 99002 | 19.8% |
Plots (not shown): Length, Common Values
Words
| Value | Count | Frequency (%) |
|---|---|---|
| education | 100418 | 20.1% |
| business | 100220 | 20.0% |
| other | 100204 | 20.0% |
| car | 100156 | 20.0% |
| home | 99002 | 19.8% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| s | 300660 | 10.4% |
| e | 299426 | 10.3% |
| n | 200638 | 6.9% |
| i | 200638 | 6.9% |
| u | 200638 | 6.9% |
| t | 200622 | 6.9% |
| a | 200574 | 6.9% |
| r | 200360 | 6.9% |
| o | 199420 | 6.9% |
| E | 100418 | 3.5% |
| Other values (8) | 799624 | 27.5% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2903018 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2903018 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 2903018 | 100.0% |
credit_score
Real number (ℝ)
Missing
| Distinct | 40920 |
|---|---|
| Distinct (%) | 8.6% |
| Missing | 25000 |
| Missing (%) | 5.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 649.87077 |
| Minimum | 300 |
|---|---|
| Maximum | 850 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.8 MiB |
Quantile statistics
| Minimum | 300 |
|---|---|
| 5-th percentile | 518.59 |
| Q1 | 596 |
| median | 649.93 |
| Q3 | 703.84 |
| 95-th percentile | 781.87 |
| Maximum | 850 |
| Range | 550 |
| Interquartile range (IQR) | 107.84 |
Descriptive statistics
| Standard deviation | 79.530061 |
|---|---|
| Coefficient of variation (CV) | 0.12237827 |
| Kurtosis | -0.11694106 |
| Mean | 649.87077 |
| Median Absolute Deviation (MAD) | 53.92 |
| Skewness | -0.039704784 |
| Sum | 3.0868862 × 10⁸ |
| Variance | 6325.0306 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 850 | 2895 | 0.6% |
| 632.96 | 43 | < 0.1% |
| 688.16 | 41 | < 0.1% |
| 628.11 | 40 | < 0.1% |
| 636.04 | 40 | < 0.1% |
| 618.87 | 40 | < 0.1% |
| 658.33 | 39 | < 0.1% |
| 646.4 | 39 | < 0.1% |
| 628.91 | 39 | < 0.1% |
| 623.66 | 38 | < 0.1% |
| Other values (40910) | 471746 | 94.3% |
| (Missing) | 25000 | 5.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 300 | 1 | < 0.1% |
| 302.14 | 1 | < 0.1% |
| 302.95 | 1 | < 0.1% |
| 303.68 | 1 | < 0.1% |
| 305.01 | 1 | < 0.1% |
| 310.32 | 1 | < 0.1% |
| 310.37 | 1 | < 0.1% |
| 317.03 | 1 | < 0.1% |
| 318.51 | 1 | < 0.1% |
| 320.84 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 850 | 2895 | 0.6% |
| 849.99 | 1 | < 0.1% |
| 849.98 | 2 | < 0.1% |
| 849.97 | 4 | < 0.1% |
| 849.95 | 1 | < 0.1% |
| 849.94 | 1 | < 0.1% |
| 849.93 | 2 | < 0.1% |
| 849.92 | 1 | < 0.1% |
| 849.91 | 2 | < 0.1% |
| 849.87 | 3 | < 0.1% |
repayment_history
Real number (ℝ)
Zeros
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.999366 |
| Minimum | 0 |
|---|---|
| Maximum | 13 |
| Zeros | 67883 |
| Zeros (%) | 13.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 5 |
| Maximum | 13 |
| Range | 13 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.4142778 |
|---|---|
| Coefficient of variation (CV) | 0.70736312 |
| Kurtosis | 0.51765062 |
| Mean | 1.999366 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.7086332 |
| Sum | 999683 |
| Variance | 2.0001816 |
| Monotonicity | Not monotonic |
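The mean and variance of this count variable are almost equal, which is what a Poisson-distributed count would show. The sketch below compares the observed share of zeros with the share a Poisson distribution with the same mean would predict. It is an illustrative check only: the report does not state how repayment_history was generated, and the Poisson comparison is an assumption introduced here.

```python
import math

# Figures copied from the repayment_history tables.
n_observations = 500_000
mean = 1.999366
variance = 2.0001816
zeros = 67_883

# Index of dispersion: close to 1 for a Poisson-like count.
dispersion = variance / mean

# Observed share of zeros versus the Poisson prediction exp(-mean).
observed_p0 = zeros / n_observations
poisson_p0 = math.exp(-mean)

print(f"Dispersion: {dispersion:.4f}")
print(f"Observed zeros: {observed_p0:.4f}, Poisson prediction: {poisson_p0:.4f}")
```

If the two shares are close, the Zeros alert is less likely to signal a data-quality problem and more likely to reflect the natural shape of a low-mean count.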
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 135562 | 27.1% |
| 1 | 134979 | 27.0% |
| 3 | 90208 | 18.0% |
| 0 | 67883 | 13.6% |
| 4 | 45184 | 9.0% |
| 5 | 17902 | 3.6% |
| 6 | 5977 | 1.2% |
| 7 | 1755 | 0.4% |
| 8 | 420 | 0.1% |
| 9 | 105 | < 0.1% |
| Other values (4) | 25 | < 0.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 67883 | 13.6% |
| 1 | 134979 | 27.0% |
| 2 | 135562 | 27.1% |
| 3 | 90208 | 18.0% |
| 4 | 45184 | 9.0% |
| 5 | 17902 | 3.6% |
| 6 | 5977 | 1.2% |
| 7 | 1755 | 0.4% |
| 8 | 420 | 0.1% |
| 9 | 105 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 13 | 1 | < 0.1% |
| 12 | 1 | < 0.1% |
| 11 | 3 | < 0.1% |
| 10 | 20 | < 0.1% |
| 9 | 105 | < 0.1% |
| 8 | 420 | 0.1% |
| 7 | 1755 | 0.4% |
| 6 | 5977 | 1.2% |
| 5 | 17902 | 3.6% |
| 4 | 45184 | 9.0% |
transaction_count
Real number (ℝ)
| Distinct | 65 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 50.008804 |
| Minimum | 20 |
|---|---|
| Maximum | 85 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.8 MiB |
Quantile statistics
| Minimum | 20 |
|---|---|
| 5-th percentile | 39 |
| Q1 | 45 |
| median | 50 |
| Q3 | 55 |
| 95-th percentile | 62 |
| Maximum | 85 |
| Range | 65 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 7.063937 |
|---|---|
| Coefficient of variation (CV) | 0.14125387 |
| Kurtosis | 0.029693761 |
| Mean | 50.008804 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 0.1439653 |
| Sum | 25004402 |
| Variance | 49.899206 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 50 | 28225 | 5.6% |
| 49 | 28162 | 5.6% |
| 48 | 27773 | 5.6% |
| 51 | 27518 | 5.5% |
| 52 | 26655 | 5.3% |
| 47 | 26514 | 5.3% |
| 46 | 24942 | 5.0% |
| 53 | 24931 | 5.0% |
| 54 | 23119 | 4.6% |
| 45 | 23074 | 4.6% |
| Other values (55) | 239087 | 47.8% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 20 | 2 | < 0.1% |
| 22 | 2 | < 0.1% |
| 23 | 5 | < 0.1% |
| 24 | 13 | < 0.1% |
| 25 | 22 | < 0.1% |
| 26 | 35 | < 0.1% |
| 27 | 62 | < 0.1% |
| 28 | 117 | < 0.1% |
| 29 | 223 | < 0.1% |
| 30 | 346 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 85 | 2 | < 0.1% |
| 84 | 1 | < 0.1% |
| 83 | 2 | < 0.1% |
| 82 | 2 | < 0.1% |
| 81 | 11 | < 0.1% |
| 80 | 7 | < 0.1% |
| 79 | 24 | < 0.1% |
| 78 | 30 | < 0.1% |
| 77 | 50 | < 0.1% |
| 76 | 79 | < 0.1% |
spending_ratio
Real number (ℝ)
| Distinct | 8384 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.086239 |
| Minimum | 5 |
|---|---|
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.8 MiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 15.37 |
| Q1 | 29.93 |
| median | 40.02 |
| Q3 | 50.16 |
| 95-th percentile | 64.73 |
| Maximum | 100 |
| Range | 95 |
| Interquartile range (IQR) | 20.23 |
Descriptive statistics
| Standard deviation | 14.875281 |
|---|---|
| Coefficient of variation (CV) | 0.37108198 |
| Kurtosis | -0.16384105 |
| Mean | 40.086239 |
| Median Absolute Deviation (MAD) | 10.12 |
| Skewness | 0.063279303 |
| Sum | 20043120 |
| Variance | 221.27399 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 4880 | 1.0% |
| 39.63 | 176 | < 0.1% |
| 36.95 | 167 | < 0.1% |
| 46 | 165 | < 0.1% |
| 42.57 | 164 | < 0.1% |
| 40.15 | 162 | < 0.1% |
| 38.72 | 162 | < 0.1% |
| 34.26 | 161 | < 0.1% |
| 38.46 | 161 | < 0.1% |
| 39.43 | 160 | < 0.1% |
| Other values (8374) | 493642 | 98.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 4880 | 1.0% |
| 5.01 | 10 | < 0.1% |
| 5.02 | 11 | < 0.1% |
| 5.03 | 13 | < 0.1% |
| 5.04 | 7 | < 0.1% |
| 5.05 | 10 | < 0.1% |
| 5.06 | 4 | < 0.1% |
| 5.07 | 13 | < 0.1% |
| 5.08 | 8 | < 0.1% |
| 5.09 | 2 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 13 | < 0.1% |
| 99.56 | 1 | < 0.1% |
| 99.53 | 1 | < 0.1% |
| 99.39 | 1 | < 0.1% |
| 99.17 | 1 | < 0.1% |
| 98.62 | 1 | < 0.1% |
| 98.3 | 1 | < 0.1% |
| 97.87 | 1 | < 0.1% |
| 97.61 | 1 | < 0.1% |
| 97.48 | 1 | < 0.1% |
join_date
Date
| Distinct | 3650 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.8 MiB |
| Minimum | 2015-01-01 00:00:00 |
|---|---|
| Maximum | 2024-12-28 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
default_flag
Categorical
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 424435 | 84.9% |
| 1 | 75565 | 15.1% |
Plots (not shown): Length, Common Values
Words
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 424435 | 84.9% |
| 1 | 75565 | 15.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 424435 | 84.9% |
| 1 | 75565 | 15.1% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 500000 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 500000 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 500000 | 100.0% |
Interactions
Correlations
| | age | annual_income | credit_score | customer_id | default_flag | education_level | employment_type | gender | loan_amount | loan_purpose | region | repayment_history | spending_ratio | transaction_count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| age | 1.000 | 0.001 | 0.001 | 0.000 | 0.004 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | -0.002 | 0.001 | 0.001 |
| annual_income | 0.001 | 1.000 | -0.001 | 0.001 | 0.000 | 0.001 | 0.000 | 0.003 | -0.002 | 0.001 | 0.000 | 0.000 | 0.002 | 0.001 |
| credit_score | 0.001 | -0.001 | 1.000 | -0.002 | 0.000 | 0.000 | 0.003 | 0.000 | 0.001 | 0.000 | 0.000 | -0.001 | 0.001 | 0.001 |
| customer_id | 0.000 | 0.001 | -0.002 | 1.000 | 0.004 | 0.000 | 0.000 | 0.000 | -0.001 | 0.002 | 0.000 | -0.001 | -0.000 | -0.001 |
| default_flag | 0.004 | 0.000 | 0.000 | 0.004 | 1.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 0.000 |
| education_level | 0.002 | 0.001 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.002 | 0.001 | 0.003 |
| employment_type | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 1.000 | 0.001 | 0.001 | 0.002 | 0.002 | 0.001 | 0.000 | 0.000 |
| gender | 0.000 | 0.003 | 0.000 | 0.000 | 0.003 | 0.000 | 0.001 | 1.000 | 0.005 | 0.005 | 0.002 | 0.000 | 0.002 | 0.001 |
| loan_amount | 0.000 | -0.002 | 0.001 | -0.001 | 0.000 | 0.000 | 0.001 | 0.005 | 1.000 | 0.001 | 0.001 | 0.001 | 0.002 | -0.001 |
| loan_purpose | 0.000 | 0.001 | 0.000 | 0.002 | 0.000 | 0.001 | 0.002 | 0.005 | 0.001 | 1.000 | 0.000 | 0.002 | 0.002 | 0.000 |
| region | 0.002 | 0.000 | 0.000 | 0.000 | 0.004 | 0.000 | 0.002 | 0.002 | 0.001 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 |
| repayment_history | -0.002 | 0.000 | -0.001 | -0.001 | 0.000 | 0.002 | 0.001 | 0.000 | 0.001 | 0.002 | 0.000 | 1.000 | -0.000 | 0.001 |
| spending_ratio | 0.001 | 0.002 | 0.001 | -0.000 | 0.000 | 0.001 | 0.000 | 0.002 | 0.002 | 0.002 | 0.000 | -0.000 | 1.000 | -0.002 |
| transaction_count | 0.001 | 0.001 | 0.001 | -0.001 | 0.000 | 0.003 | 0.000 | 0.001 | -0.001 | 0.000 | 0.000 | 0.001 | -0.002 | 1.000 |
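Every off-diagonal coefficient in this matrix is close to zero. The sketch below parses the rows as printed and reports the largest absolute off-diagonal value. It makes no assumption about which association measure each cell holds, since the report does not state it here. The parsing approach and variable names are illustrative.

```python
# Rows copied from the correlation table: label followed by 14 coefficients.
table = """
age | 1.000 | 0.001 | 0.001 | 0.000 | 0.004 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | -0.002 | 0.001 | 0.001
annual_income | 0.001 | 1.000 | -0.001 | 0.001 | 0.000 | 0.001 | 0.000 | 0.003 | -0.002 | 0.001 | 0.000 | 0.000 | 0.002 | 0.001
credit_score | 0.001 | -0.001 | 1.000 | -0.002 | 0.000 | 0.000 | 0.003 | 0.000 | 0.001 | 0.000 | 0.000 | -0.001 | 0.001 | 0.001
customer_id | 0.000 | 0.001 | -0.002 | 1.000 | 0.004 | 0.000 | 0.000 | 0.000 | -0.001 | 0.002 | 0.000 | -0.001 | -0.000 | -0.001
default_flag | 0.004 | 0.000 | 0.000 | 0.004 | 1.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 0.000
education_level | 0.002 | 0.001 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.002 | 0.001 | 0.003
employment_type | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 1.000 | 0.001 | 0.001 | 0.002 | 0.002 | 0.001 | 0.000 | 0.000
gender | 0.000 | 0.003 | 0.000 | 0.000 | 0.003 | 0.000 | 0.001 | 1.000 | 0.005 | 0.005 | 0.002 | 0.000 | 0.002 | 0.001
loan_amount | 0.000 | -0.002 | 0.001 | -0.001 | 0.000 | 0.000 | 0.001 | 0.005 | 1.000 | 0.001 | 0.001 | 0.001 | 0.002 | -0.001
loan_purpose | 0.000 | 0.001 | 0.000 | 0.002 | 0.000 | 0.001 | 0.002 | 0.005 | 0.001 | 1.000 | 0.000 | 0.002 | 0.002 | 0.000
region | 0.002 | 0.000 | 0.000 | 0.000 | 0.004 | 0.000 | 0.002 | 0.002 | 0.001 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
repayment_history | -0.002 | 0.000 | -0.001 | -0.001 | 0.000 | 0.002 | 0.001 | 0.000 | 0.001 | 0.002 | 0.000 | 1.000 | -0.000 | 0.001
spending_ratio | 0.001 | 0.002 | 0.001 | -0.000 | 0.000 | 0.001 | 0.000 | 0.002 | 0.002 | 0.002 | 0.000 | -0.000 | 1.000 | -0.002
transaction_count | 0.001 | 0.001 | 0.001 | -0.001 | 0.000 | 0.003 | 0.000 | 0.001 | -0.001 | 0.000 | 0.000 | 0.001 | -0.002 | 1.000
"""

# Parse each row into {label: [coefficients]}.
matrix = {}
for line in table.strip().splitlines():
    label, *values = [cell.strip() for cell in line.split("|")]
    matrix[label] = [float(v) for v in values]

names = list(matrix)

# Collect (|coefficient|, row, column) for the upper triangle only.
pairs = [
    (abs(matrix[a][j]), a, b)
    for i, a in enumerate(names)
    for j, b in enumerate(names)
    if j > i
]
strongest = max(pairs)

print(f"Largest |coefficient|: {strongest[0]:.3f} ({strongest[1]} vs {strongest[2]})")
```

With a maximum this small, no pair of variables in the table shows a meaningful association, including any pairing with default_flag.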
Missing values
Sample
| | customer_id | age | gender | region | education_level | employment_type | annual_income | loan_amount | loan_purpose | credit_score | repayment_history | transaction_count | spending_ratio | join_date | default_flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 100000 | 59.0 | Female | East | Secondary | Salaried | 240079.54 | 92740.78 | Education | 784.17 | 1 | 53 | 41.30 | 2016-10-06 | 0 |
| 1 | 100001 | 49.0 | Female | South | Graduate | Self-Employed | 438923.30 | 64315.33 | Car | 589.70 | 1 | 62 | 12.19 | 2022-04-01 | 0 |
| 2 | 100002 | 35.0 | Female | North | Secondary | Salaried | 424122.06 | 632481.94 | Home | 625.79 | 2 | 45 | 23.68 | 2024-12-24 | 0 |
| 3 | 100003 | 63.0 | Female | North | Secondary | Salaried | 322274.92 | 118465.97 | Car | 627.48 | 2 | 57 | 32.66 | 2021-03-17 | 0 |
| 4 | 100004 | 28.0 | Female | East | Graduate | Self-Employed | 1371925.76 | 131836.27 | Car | 803.18 | 1 | 46 | 15.40 | 2024-04-25 | 1 |
| 5 | 100005 | NaN | Male | North | Graduate | Salaried | 140205.29 | 636816.05 | Car | NaN | 2 | 63 | 47.89 | 2020-08-11 | 0 |
| 6 | 100006 | NaN | Male | South | Secondary | Salaried | NaN | 54893.56 | Education | 618.81 | 0 | 39 | 30.48 | 2022-01-12 | 0 |
| 7 | 100007 | 39.0 | Male | North | Primary | Salaried | 884965.70 | 112443.67 | Education | 789.83 | 2 | 48 | 48.40 | 2017-10-23 | 0 |
| 8 | 100008 | 43.0 | Male | North | Graduate | Salaried | 410172.80 | 207545.02 | Car | 557.85 | 2 | 61 | 35.56 | 2020-12-02 | 0 |
| 9 | 100009 | 31.0 | Male | East | Graduate | Self-Employed | 731759.92 | 213505.15 | Business | 596.20 | 2 | 56 | 29.38 | 2019-08-07 | 0 |
| | customer_id | age | gender | region | education_level | employment_type | annual_income | loan_amount | loan_purpose | credit_score | repayment_history | transaction_count | spending_ratio | join_date | default_flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 499990 | 599990 | 41.0 | Male | East | Secondary | Self-Employed | 397384.54 | 233648.49 | Home | 771.71 | 0 | 57 | 39.47 | 2022-02-24 | 0 |
| 499991 | 599991 | 56.0 | Male | North | Graduate | Salaried | 373832.46 | 176198.19 | Car | 602.24 | 0 | 38 | 30.83 | 2018-03-21 | 0 |
| 499992 | 599992 | 35.0 | Female | East | Graduate | Salaried | NaN | 343362.36 | Car | 532.31 | 2 | 57 | 44.99 | 2019-09-25 | 0 |
| 499993 | 599993 | 67.0 | Female | West | Secondary | Salaried | 1536894.34 | 199143.32 | Car | 605.07 | 0 | 45 | 42.09 | 2020-07-27 | 0 |
| 499994 | 599994 | 67.0 | Male | South | Primary | Self-Employed | 374245.04 | 279921.48 | Education | 692.43 | 5 | 46 | 48.84 | 2016-11-27 | 0 |
| 499995 | 599995 | 31.0 | Female | West | Graduate | Salaried | 591909.32 | 89253.73 | Education | 627.74 | 0 | 37 | 59.93 | 2024-03-08 | 1 |
| 499996 | 599996 | 63.0 | Male | North | Secondary | Unemployed | 983386.51 | 119731.07 | Business | 771.31 | 2 | 59 | 11.37 | 2016-06-22 | 0 |
| 499997 | 599997 | 63.0 | Other | South | Secondary | Salaried | 280465.76 | 340991.05 | Education | 663.07 | 2 | 55 | 48.06 | 2021-04-16 | 0 |
| 499998 | 599998 | 31.0 | Male | North | Primary | Salaried | 304002.49 | 75333.63 | Car | 718.97 | 4 | 58 | 37.98 | 2024-05-08 | 0 |
| 499999 | 599999 | 39.0 | Female | East | Secondary | Salaried | 259383.79 | 480386.08 | Car | NaN | 1 | 39 | 22.53 | 2023-07-16 | 0 |